Analysis of Statistical Hypothesis based Learning Mechanism for Faster Crawling

نویسندگان

  • Sudarshan Nandy
  • Partha Pratim Sarkar
  • Achintya Das
چکیده

The growth of world-wide-web (WWW) spreads its wings from an intangible quantities of web-pages to a gigantic hub of web information which gradually increases the complexity of crawling process in a search engine. A search engine handles a lot of queries from various parts of this world, and the answers of it solely depend on the knowledge that it gathers by means of crawling. The information sharing becomes a most common habit of the society, and it is done by means of publishing structured, semi-structured and unstructured resources on the web. This social practice leads to an exponential growth of web-resource, and hence it became essential to crawl for continuous updating of web-knowledge and modification of several existing resources in any situation. In this paper one statistical hypothesis based learning mechanism is incorporated for learning the behaviour of crawling speed in different environment of network, and for intelligently control of the speed of crawler. The scaling technique is used to compare the performance proposed method with the standard crawler. The high speed performance is observed after scaling, and the retrieval of relevant web-resource in such a high speed is analysed.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Analysis of a Statistical Hypothesis Based Learning Mechanism for Faster crawling

3.4 ,"" BLOCKIN0122$%"01$%0!11"$$ "$ BLOCKIN0," BLOCKIN01 BLOCKIN BLOCKIN$!! !!!1.0120"" BLOCKIN".

متن کامل

Analysis of the No Return Point Hypothesis: The Effect of Audio and Visual Stimuli in the Fast Movements Inhibition

Background. The No Return Point hypothesis is one of the research areas that has been done in line with the motor program. In this hypothesis emphasized an inability to inhibition move after its start by the motor program. Several factors are affecting the mechanism of this inhibition. Objectives. In this study, we investigate the effects of audio and visual stimuli on blocking quick moves to ...

متن کامل

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler it is not an simple task to download the domain specific web pages. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed, among them a key technique is focused crawling which is able to crawl particular topical...

متن کامل

Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains

In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and determination of its quality is very important. Various image processing algorithms are applied in recent years to detect different agricultural products. The problem of rice classification and quality detection in this paper is presented based on model learning concepts includ...

متن کامل

Task Effectiveness Predictors: Technique Feature Analysis VS. Involvement Load Hypothesis

How deeply a word is processed has long been considered as a crucial factor in the realm of vocabulary acquisition. In literature, two frameworks have been proposed to operationalize the depth of processing, namely the Involvement Load Hypothesis (ILH) and the Technique Feature Analysis (TFA). However, they differ in the way they have operationalized it specially in terms of their attentional c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1208.2808  شماره 

صفحات  -

تاریخ انتشار 2012